Author Details

Thakkar, Amit

Learning Using Heterogeneous Classifier in Data Mining

Improved K-Means with Dimensionality Reduction Technique

Abstract Views :181 | PDF Views:3

Authors

Amit Thakkar ¹, Nikita Bhatt ¹, Amit Ganatra ¹, Arpita Shah ¹

Affiliations
1 Charotar Institute of Technology, Changa, Nadiad, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 12 (2011), Pagination: 722-725

Abstract

Clustering is the process of grouping objects so that the objects within a group are similar to one another and different from the objects in other groups. K-means is a well-known partitioning-based clustering technique that attempts to find a user-specified number of clusters, each represented by its centroid. K-means often does not work well on high-dimensional data; hence, to improve efficiency, we apply PCA, a dimensionality reduction technique, to the data set and obtain a reduced data set of uncorrelated variables. A challenging task for any clustering method is determining the number of clusters beforehand. To find the number of clusters, we apply the EM method, which indicates how many clusters the user should choose by fitting a mixture of Gaussians to the given data set. Finally, the experimental results show that techniques such as PCA and EM improve the efficiency of K-means clustering.
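The pipeline described in this abstract (reduce the data with PCA, then run K-means on the reduced data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it keeps only the first principal component (found by power iteration), the toy data and all function names are hypothetical, and the number of clusters is fixed by hand rather than estimated with EM as the paper proposes.

```python
import math
import random


def first_principal_component(rows, iterations=100):
    """Return (mean, unit direction of largest variance) using power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0] * d  # assumed not orthogonal to the leading eigenvector
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return mean, vec


def project(rows, mean, direction):
    """Project each row onto the direction, giving 1-dimensional points."""
    return [[sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))]
            for r in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical toy data: two well-separated groups in three dimensions.
data = [[1.0, 2.0, 0.5], [1.2, 1.8, 0.4], [0.8, 2.2, 0.6],
        [8.0, 9.0, 0.5], [8.2, 9.1, 0.6], [7.9, 8.8, 0.4]]
mean, direction = first_principal_component(data)
reduced = project(data, mean, direction)
labels, centroids = kmeans(reduced, k=2)
print(labels)
```

Running K-means on the projected points means each distance computation works on one coordinate instead of three, which is the kind of saving the abstract attributes to dimensionality reduction.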

Keywords

Cluster, EM, K-Mean, PCA.

Comprehensive and Evolution Study Focusing Future Research Challenges in the Field of Multi Relational Data Mining Specific to Multi-Relational Classification Approaches

Abstract Views :215 | PDF Views:2

Authors

Amit Thakkar ¹, Y. P. Kosta ²

Affiliations
1 Chandubhai S. Patel Institute of Technology, Changa, Gujarat, IN
2 Marwadi Group of Institutions, Rajkot, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 3, No 10 (2011), Pagination: 594-598

Abstract

Most of today’s structured data is stored in relational databases. Thus, the task of learning from relational data has begun to receive significant attention in the literature. Unfortunately, most methods only utilize “flat” data representations. To apply these single-table data mining techniques, we are forced to incur a computational penalty by first converting the data into this “flat” form. As a result of this transformation, the data not only loses its compact representation, but the semantic information present in the relations is also reduced or eliminated. As an important task of multi-relational data mining, multi-relational classification can directly look for patterns that involve multiple relations in a relational database, which gives it advantages over propositional data mining approaches. According to their differences in knowledge representation and strategy, the paper groups multi-relational classification approaches into ILP-based, graph-based and relational database-based approaches, and discusses each technology in detail, covering its characteristics, comparisons between approaches and several challenging research problems.

Keywords

Multi-Relational Data Mining, Multi-Relational Classification, Inductive Logic Programming (ILP), Graph, Selection Graph, Tuple ID Propagation.

Classification using Generalization Based Decision Tree Induction along with Relevance Analysis Based on Relational Database

Abstract Views :196 | PDF Views:3

Authors

Amit Thakkar ¹, Yogeshwar P. Kosta ², Amit Ganatra ²

Affiliations
1 Charotar Institute of Technology, Changa, Gujarat, IN
2 Charotar Institute of Technology, Changa, Gujarat, IN

Source

Data Mining and Knowledge Engineering, Vol 2, No 10 (2010), Pagination: 287-293

Abstract

Classification is the process of predicting unknown values of certain attributes of interest based on the values of other attributes, and it is a major challenge in data mining. A commonly used method is the decision tree. The efficiency of decision tree algorithms has been well established for relatively small data sets. However, this method has problems when handling larger data sets and data with continuous numerical values, and it tends to favor attributes with many distinct values when selecting the final determining attribute. In data mining applications, large training sets are common; therefore, decision tree algorithms have scalability limitations. Also, in most data mining applications, users have little knowledge of which signature attribute should be selected for effective mining, so the user depends heavily on the capability of the algorithm. In this paper, we address two things: first, selecting the right signature attribute, and second, handling large data sets. We accomplish this by proposing a new data classification method that integrates a sequence of preprocessing steps, namely data cleaning, attribute-oriented induction (identifying the signature attribute) and relevance analysis, followed by decision tree induction. This stepwise approach helps us derive simple extraction rules at multiple levels of abstraction and handles large data sets and continuous numerical values in a scalable way.
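The tendency to favor many-valued attributes mentioned in this abstract can be illustrated with a small sketch. The abstract does not name the attribute selection measure, so this example assumes information gain (as in ID3) and shows gain ratio as one well-known correction; the toy rows and the function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting the rows on the attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def gain_ratio(rows, attribute, target):
    """Information gain divided by the entropy of the split itself."""
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0
    return information_gain(rows, attribute, target) / split_info


# Hypothetical toy data: "id" is unique per row, "windy" has two values.
rows = [
    {"id": 1, "windy": "no", "play": "yes"},
    {"id": 2, "windy": "no", "play": "yes"},
    {"id": 3, "windy": "no", "play": "yes"},
    {"id": 4, "windy": "yes", "play": "no"},
    {"id": 5, "windy": "yes", "play": "no"},
    {"id": 6, "windy": "yes", "play": "yes"},
]
for name in ("id", "windy"):
    print(name, information_gain(rows, name, "play"), gain_ratio(rows, name, "play"))
```

On this data, information gain ranks the unique "id" column above "windy" even though it cannot generalize, while gain ratio reverses the ranking. Relevance analysis before tree induction serves a similar purpose by filtering out such attributes early.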

Keywords

Data Mining, Classification, Data Cleaning, Decision Tree Induction, Relevance Analysis.

An Improved Expectation Maximization based Semi-Supervised Text Classification using Naïve Bayes and Support Vector Machine

Abstract Views :201 | PDF Views:3

Authors

Purvi Rekh ¹, Amit Thakkar ², Amit Ganatra ³

Affiliations
1 Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
2 Department of Information and Technology, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN
3 U & PU Patel Department of Computer Engineering, Chandubhai S Patel Institute of Technology, Changa, Petlad, IN

Source

Artificial Intelligent Systems and Machine Learning, Vol 4, No 5 (2012), Pagination: 330-335

Abstract

With the development of the Internet and the emergence of a large number of text resources, automatic text classification has become a research hotspot. As the number of training documents increases, the accuracy of text classification increases. Traditional classifiers (supervised learning) use only labeled data for training. Labeled instances are often difficult, expensive, or time-consuming to obtain, whereas unlabeled data may be relatively easy to collect. Semi-supervised learning makes use of both labeled and unlabeled data. Several researchers have proposed algorithms for text classification using semi-supervised learning, but improving its accuracy remains a challenge. In the iterative process of standard Expectation Maximization (EM) based semi-supervised learning, some unlabeled samples are misclassified by the current classifier because the initial labeled samples are too few. To overcome this limitation, this paper proposes an EM-based semi-supervised learning algorithm that uses Naïve Bayes and a Support Vector Machine to improve the accuracy of text classification.
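The standard EM-based semi-supervised procedure that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of EM with a multinomial Naïve Bayes classifier only; the SVM component of the proposed algorithm is not described in the abstract and is not shown. The toy documents, class names and function names are all hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, weights, classes, vocab):
    """Fit multinomial Naive Bayes from documents with soft class weights."""
    prior = {c: 1.0 for c in classes}  # add-one smoothing on class priors
    counts = {c: Counter() for c in classes}
    for tokens, w in zip(docs, weights):
        for c in classes:
            prior[c] += w[c]
            for t in tokens:
                counts[c][t] += w[c]
    total = sum(prior.values())
    log_prior = {c: math.log(prior[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(counts[c].values()) + len(vocab)
        log_like[c] = {t: math.log((counts[c][t] + 1.0) / denom) for t in vocab}
    return log_prior, log_like


def posterior(tokens, model, classes):
    """Class probabilities for one document; unknown tokens are ignored."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
              for c in classes}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def em_naive_bayes(labeled, unlabeled, classes, rounds=10):
    """Train on labeled data, then alternate E-steps and M-steps."""
    vocab = {t for tokens, _ in labeled for t in tokens}
    vocab |= {t for tokens in unlabeled for t in tokens}
    lab_docs = [tokens for tokens, _ in labeled]
    lab_w = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    model = train_nb(lab_docs, lab_w, classes, vocab)
    for _ in range(rounds):
        # E-step: estimate soft labels for the unlabeled documents.
        soft = [posterior(tokens, model, classes) for tokens in unlabeled]
        # M-step: refit on labeled plus softly labeled documents.
        model = train_nb(lab_docs + unlabeled, lab_w + soft, classes, vocab)
    return model


classes = ["sport", "politics"]
labeled = [(["goal", "match", "team"], "sport"),
           (["election", "vote", "party"], "politics")]
unlabeled = [["team", "coach", "league"], ["vote", "senate", "bill"],
             ["coach", "league", "goal"], ["senate", "party", "bill"]]
model = em_naive_bayes(labeled, unlabeled, classes)
print(posterior(["coach", "league"], model, classes))
```

The words "coach" and "league" never appear in a labeled document, so a classifier trained on the labeled data alone cannot separate the classes for that query; the unlabeled documents supply the co-occurrence evidence. The same loop also shows the weakness the abstract points out: if the initial classifier mislabels unlabeled documents, later rounds reinforce the error.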

Keywords

Expectation Maximization (EM), Naïve Bayes (NB), Support Vector Machine (SVM), Semi-Supervised Learning (SSL).

A Novel Approach for Making Recommendation using Skyline Query based on user Location and Preference

Abstract Views :161 | PDF Views:0

Authors

Sanket Shah ¹, Amit Thakkar ¹, Sonal Rami ¹

Affiliations
1 Department of Information Technology, CSPIT, CHARUSAT, Anand - 388421, Gujarat, IN

Source

Indian Journal of Science and Technology, Vol 9, No 30 (2016), Pagination:

Abstract

Objectives: To propose a method that handles a large number of users and improves the accuracy and quality of a recommendation system. Methods/Statistical Analysis: This paper presents an effective method to identify locations for a user based on his/her preferences using a skyline query, which rules out dominated objects. An object dominates another if it is as good or better in all dimensions and better in at least one dimension. The use of skyline queries in recommendation systems has increased in recent years. Such systems mainly use location-based services to find the nearest location based on user preference. Location-based services are information services and have a number of uses in social networking. A location-based service finds the nearest location based on user preferences but does not provide locations based on similarity and rating, so the user is not satisfied with the result. Findings: To resolve this problem, we use the collaborative filtering technique, the K-nearest neighbor algorithm and a ranking scheme. Using collaborative filtering, we find the similarity and rating of an item. The K-nearest neighbor approach finds the distance to the nearest similar items, and the ranking technique is used to choose the nearest location. In this paper, we take a temporary dataset and evaluate our proposed system mathematically. Application/Improvements: In future, we will develop a web tool that identifies locations and displays the results on a map. We will also examine users' past movement history using a content-based recommendation system. Skyline queries in recommendation systems are useful in various domains, e.g. house rental and purchase, travel and the tourism business.
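The dominance relation behind a skyline query can be made concrete with a short sketch. It assumes the usual definition (an object dominates another if it is at least as good in every dimension and strictly better in at least one) and treats lower values as better; the places and their (distance, price) values are hypothetical, and the collaborative filtering, K-nearest neighbor and ranking steps from the abstract are not shown.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere.

    Lower values are treated as better in every dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical places described by (distance in km, price).
places = [(1.0, 90), (2.0, 60), (3.0, 80), (4.0, 40), (2.5, 95)]
print(skyline(places))  # [(1.0, 90), (2.0, 60), (4.0, 40)]
```

Here (3.0, 80) is dropped because (2.0, 60) is both closer and cheaper, and (2.5, 95) is dropped because (1.0, 90) is. The remaining places are trade-offs between distance and price, which is where a similarity- or rating-based ranking would then have to choose.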

Keywords

Collaborative Filtering Technique, Dominated Object, K-Nearest Neighbor, Recommendation System, Skyline Query.

Education Data Mining, Visualization and Sentiment Analysis of Coursera Course Review

Abstract Views :109 | PDF Views:0

Authors

Dhaval Bhoi ¹, Amit Thakkar ²

Affiliations
1 U & P U. Patel Department of Computer Engineering, IN
2 Department of Computer Science & Engineering, Chandubhai S. Patel Institute of Technology, Faculty of Technology & Engineering, Charotar University of Science and Technology, Changa – 388421, Gujarat, IN

Source

Journal of Engineering Education Transformations, Vol 36, No 2 (2022), Pagination: 169-177

Abstract

Objective: Decisions are neither good nor bad in themselves; they are taken based on the available data. It is essential to present the data in the right form, to the right people and at the right time. Higher Engineering Institutes (HEIs) have a plethora of information available to them, yet most of the available data is not used properly and remains dead storage. Methods: In this study, we show the importance of data visualization through a case study on a Coursera review dataset. Different useful tools that support improving an education system are summarized. Sentiment analysis is performed on the Coursera course review dataset using a deep learning method. Finally, a dashboard is created to visualize student data using the Power BI tool. Results: The use of different visualization tools can help to improve the education system and its performance. The sentiment expressed by students will help to improve the teaching-learning process and research contribution significantly, as these are the major components of evaluation when any HEI wants to receive NAAC [National Assessment and Accreditation Council] approval for the benefit of all stakeholders of the HEI. Conclusions: Proper analysis of the available data and its proper visualization can help us to improve the education system to a great extent, particularly in the most important factors such as student teaching-learning and placement. The sentiments expressed by students are also key features for analyzing the success of the teaching-learning process for both teachers and students. We have also used our institute's student data to generate a dashboard that presents student information from different perspectives and can help higher authorities make better decisions.

Keywords

Education Data Mining, Dashboard, Data Visualization, Sentiment.

Informatics Publishing Limited
